WHAT IS CLAIMED IS:

1. A method for searching a data repository managed by a content provider to gather indexable metadata on content at addressable locations at the data repository, comprising:

accessing settings capable of being customized by the content provider, wherein the customized settings provide instructions on how to search the content provider's data repository;

accessing content pages at the content provider's data repository;

accessing the content of content pages at the content provider's data repository in accordance with instructions included in the accessed customized settings; and

generating metadata from accessed content pages to add to an index of metadata for accessed addressable locations at the data repository.
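
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed implementation: the data repository is modeled as an in-memory dictionary, and the names `Settings`, `gather_metadata`, and the metadata fields `word_count` and `first_words` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical provider-customized settings: which locations to search."""
    locations: list = field(default_factory=list)


def gather_metadata(settings, repository):
    """Build an index mapping each accessed location to generated metadata.

    `repository` stands in for the content provider's data repository:
    a dict from addressable location to page text (an assumption).
    """
    index = {}
    for location in settings.locations:      # follow the customized settings
        page = repository.get(location)      # access the content page
        if page is None:
            continue                         # nothing stored at this location
        words = page.split()
        index[location] = {                  # generate metadata for the index
            "word_count": len(words),
            "first_words": " ".join(words[:5]),
        }
    return index


repository = {
    "http://example.com/a": "Annual report for the widget division",
    "http://example.com/b": "Contact us",
}
settings = Settings(locations=["http://example.com/a", "http://example.com/missing"])
index = gather_metadata(settings, repository)
```

Only locations named in the settings are visited, so `http://example.com/b` is never indexed even though it exists in the repository.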

2. The method of claim 1, wherein the customized settings include parameters and access methods unique to an arrangement of content in the content provider's data repository.

3. The method of claim 1, wherein the accessed customizable settings provide addressable locations at the content provider's data repository provided by the content provider, wherein accessing the content pages includes accessing the content pages at the provided addressable locations, wherein metadata is generated for the accessed content pages.

4. The method of claim 3, wherein the addressable locations comprise uniform resource locator (URL) addresses.

5. The method of claim 3, wherein the accessed customizable settings provide query terms for at least one included addressable location, further comprising:

for each provided addressable location for which there are query terms, using the provided query terms at the provided addressable location to obtain query results; and

generating metadata from the obtained query results to add to the index of metadata for accessed addressable locations at the data repository.

6. The method of claim 5, wherein the accessed customizable settings further provide qualifiers for at least one search term, further comprising:

for each query term having at least one qualifier, determining whether the query results for the query term satisfy each qualifier for the query term, wherein the metadata for the query result is generated if the query result satisfies each qualifier for the query term that generated the query result; and

performing a non-qualifying action for each query result that does not satisfy each qualifier.

7. The method of claim 6, wherein the non-qualifying action comprises not including metadata for the query result in the index.
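
The query-term and qualifier handling of claims 5 through 7 might be sketched as follows. This is an assumption-laden illustration: `index_query_results`, `run_query`, and the `catalog` data are hypothetical, and qualifiers are modeled as simple predicates on a result string.

```python
def index_query_results(query_terms, run_query, qualifiers):
    """Return metadata only for query results that satisfy every qualifier.

    query_terms: list of terms taken from the customized settings.
    run_query: callable mapping a term to a list of result strings
        (a stand-in for the repository's search facility; an assumption).
    qualifiers: dict mapping a term to a list of predicates on a result.
    """
    index = []
    for term in query_terms:
        for result in run_query(term):
            checks = qualifiers.get(term, [])
            if all(check(result) for check in checks):
                index.append({"term": term, "result": result, "length": len(result)})
            # Non-qualifying action (as in claim 7): omit the result from the index.
    return index


catalog = ["red widget", "red widget (discontinued)", "blue gadget"]


def run_query(term):
    """Hypothetical query: substring match against the catalog."""
    return [item for item in catalog if term in item]


qualifiers = {"widget": [lambda r: "discontinued" not in r]}
entries = index_query_results(["widget", "gadget"], run_query, qualifiers)
```

A term with no qualifiers, such as `gadget` here, has all of its results indexed, because `all()` over an empty list of checks is true.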



8. The method of claim 3, wherein the accessed customizable settings further provide a password for at least one provided addressable location, further comprising:

using the provided password to access the content page at the indicated addressable location for which the password is provided.

9. The method of claim 1, wherein the accessed customizable settings further indicate a recursive search setting indicating whether to search hypertext links to linked addressable locations included in the accessed content page, further comprising:

accessing a content page at each linked addressable location included if the recursive search setting indicates to recursively search linked addressable locations, wherein metadata is generated for each content page recursively accessed at the linked addressable locations in the accessed content page.

10. The method of claim 9, wherein the accessed customizable settings further provide prohibited addressable locations at the data repository, wherein metadata is not generated for each content page at a linked addressable location that is one indicated prohibited address location.
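
A recursive search that honors prohibited locations, as described in claims 9 and 10, could look roughly like the sketch below. The function `crawl` and the `links` mapping are hypothetical; the mapping stands in for the hypertext links that would be extracted from each accessed page.

```python
def crawl(start, links, recursive, prohibited):
    """Return the locations for which metadata would be generated.

    links: dict mapping a location to the locations it links to
        (a stand-in for hypertext links in each page; an assumption).
    recursive: the recursive search setting from the customized settings.
    prohibited: set of locations that must not be indexed.
    """
    visited = []
    pending = [start]
    seen = {start}
    while pending:
        location = pending.pop(0)
        visited.append(location)
        if not recursive:
            break                      # index only the starting page
        for linked in links.get(location, []):
            if linked in seen or linked in prohibited:
                continue               # skip repeats and prohibited locations
            seen.add(linked)
            pending.append(linked)
    return visited


links = {"/": ["/a", "/private"], "/a": ["/", "/b"], "/b": []}
recursive_result = crawl("/", links, True, {"/private"})
flat_result = crawl("/", links, False, set())
```

The `seen` set keeps the search from looping when pages link back to each other, as `/a` does to `/` in this example.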

11. The method of claim 1, wherein the accessed customizable settings further indicate validation checking programs, further comprising:

executing each validation checking program indicated in the accessed customizable settings against each accessed content page;

generating a validation output result with the validation checking program for each accessed content page with each validation checking program describing characteristics of the content page; and

generating metadata from the validation output result to add to the index of metadata for accessed addressable locations at the data repository.

12. The method of claim 11, wherein the accessed customizable settings further indicate at least one parameter to use with at least one validation checking program, further comprising:

using the at least one parameter when executing the validation checking program, wherein the validation output result further indicates characteristics of the content page related to the at least one parameter used with the validation checking program.

13. The method of claim 11, wherein the accessed customizable settings further indicate at least one qualifier to use with at least one validation checking program, further comprising:

determining whether the validation output result satisfies the at least one qualifier provided with the validation checking program producing the output result, wherein metadata for the output result is included in the index if the output result satisfies the qualifier.

14. The method of claim 13, wherein metadata for the content page at the addressable location is not included in the index if the validation output result does not satisfy the qualifier.
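
The validation checking of claims 11 through 14 can be sketched under a few stated assumptions: a validation checking program is modeled as a Python function taking a page and one parameter, and a qualifier is a predicate on that program's output. The names `count_long_lines` and `run_validations` are hypothetical.

```python
def count_long_lines(page, max_length):
    """Hypothetical validation program: count lines longer than max_length."""
    return sum(1 for line in page.splitlines() if len(line) > max_length)


def run_validations(pages, validations):
    """Index validation output for each page, subject to an optional qualifier.

    pages: dict mapping a location to page text.
    validations: list of dicts with keys "name", "program", "parameter",
        and "qualifier" (a predicate on the output, or None).
    """
    index = {}
    for location, page in pages.items():
        for v in validations:
            output = v["program"](page, v["parameter"])
            qualifier = v["qualifier"]
            # Include the output only if there is no qualifier or it is satisfied.
            if qualifier is None or qualifier(output):
                index.setdefault(location, {})[v["name"]] = output
    return index


pages = {"/short": "ok\nfine", "/long": "x" * 100 + "\nok"}
validations = [{
    "name": "long_lines",
    "program": count_long_lines,
    "parameter": 80,
    "qualifier": lambda output: output == 0,  # index only pages with no long lines
}]
report = run_validations(pages, validations)
```

Here `/long` produces an output of 1, fails the qualifier, and so is left out of the index entirely.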

15. The method of claim 1, further comprising:

determining a format of the accessed content page;

selecting one of a plurality of parsers capable of parsing the determined format; and

parsing the content page using the selected parser, wherein the metadata to add to the index is generated from the parsed content page.
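
Format detection followed by parser selection, as in claim 15, might be sketched as below using only the Python standard library. The format sniffing in `determine_format` is deliberately crude, and the function names and the `PARSERS` table are hypothetical.

```python
import json
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_html(text):
    parser = _TitleParser()
    parser.feed(text)
    return {"title": parser.title.strip()}


def parse_json(text):
    return {"keys": sorted(json.loads(text))}


def parse_plain(text):
    return {"word_count": len(text.split())}


def determine_format(text):
    """Guess the format from the first non-blank character (illustration only)."""
    stripped = text.lstrip()
    if stripped.startswith("<"):
        return "html"
    if stripped.startswith("{"):
        return "json"
    return "text"


# One parser per supported format; the selection is a dictionary lookup.
PARSERS = {"html": parse_html, "json": parse_json, "text": parse_plain}


def metadata_for(text):
    fmt = determine_format(text)
    return {"format": fmt, **PARSERS[fmt](text)}


html_meta = metadata_for("<html><head><title>Home</title></head></html>")
json_meta = metadata_for('{"b": 1, "a": 2}')
text_meta = metadata_for("plain words here")
```

Keeping the parsers in a table means a new format can be supported by adding one entry, without touching `metadata_for`.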

16. The method of claim 1, further comprising:

determining a parser capable of parsing an embedded file referenced in the content page;

parsing the content of the referenced embedded file; and

generating metadata for the parsed content of the embedded file to add to the index.



17. The method of claim 14, wherein the embedded file is encoded in a multimedia format.

18. The method of claim 1, further comprising:

distributing a collection tool to content providers capable of accessing and generating metadata for content provider data repositories using the accessed customizable settings; and

collecting metadata gathered from multiple content providers using the collection tool to gather metadata on their data repositories.

19. The method of claim 18, further comprising commercializing the collected metadata.

20. The method of claim 18, further comprising:

receiving an electronic subscription from content providers to use the collection tool and provide metadata.



21. A system for searching a data repository managed by a content provider to gather indexable metadata on content at addressable locations at the data repository, comprising:

means for accessing settings capable of being customized by the content provider, wherein the customized settings provide instructions on how to search the content provider's data repository;

means for accessing content pages at the content provider's data repository;

means for accessing the content of content pages at the content provider's data repository in accordance with instructions included in the accessed customized settings; and

means for generating metadata from accessed content pages to add to an index of metadata for accessed addressable locations at the data repository.



22. The system of claim 21, wherein the customized settings include parameters and access methods unique to an arrangement of content in the content provider's data repository.



23. The system of claim 22, wherein the accessed customizable settings provide addressable locations at the content provider's data repository provided by the content provider, wherein the means for accessing the content pages includes accessing the content pages at the provided addressable locations, wherein metadata is generated for the accessed content pages.



24. The system of claim 23, wherein the addressable locations comprise uniform resource locator (URL) addresses.

25. The system of claim 23, wherein the accessed customizable settings provide query terms for at least one included addressable location, further comprising:

means for using the provided query terms at the provided addressable location to obtain query results for each provided addressable location for which there are query terms; and

means for generating metadata from the obtained query results to add to the index of metadata for accessed addressable locations at the data repository.

26. The system of claim 25, wherein the accessed customizable settings further provide qualifiers for at least one search term, further comprising:

means for determining whether the query results for the query term satisfy each qualifier for the query term for each query term having at least one qualifier, wherein the metadata for the query result is generated if the query result satisfies each qualifier for the query term that generated the query result; and

means for performing a non-qualifying action for each query result that does not satisfy each qualifier.

27. The system of claim 26, wherein the non-qualifying action comprises not including metadata for the query result in the index.



28. The system of claim 22, wherein the accessed customizable settings further provide a password for at least one provided addressable location, further comprising:

means for using the provided password to access the content page at the indicated addressable location for which the password is provided.



29. The system of claim 23, wherein the accessed customizable settings further indicate a recursive search setting indicating whether to search hypertext links to linked addressable locations included in the accessed content page, further comprising:

means for accessing a content page at each linked addressable location included if the recursive search setting indicates to recursively search linked addressable locations, wherein metadata is generated for each content page recursively accessed at the linked addressable locations in the accessed content page.

30. The system of claim 29, wherein the accessed customizable settings further provide prohibited addressable locations at the data repository, wherein metadata is not generated for each content page at a linked addressable location that is one indicated prohibited address location.

31. The system of claim 21, wherein the accessed customizable settings further indicate validation checking programs, further comprising:

means for executing each validation checking program indicated in the accessed customizable settings against each accessed content page;

means for generating a validation output result with the validation checking program for each accessed content page with each validation checking program describing characteristics of the content page; and

means for generating metadata from the validation output result to add to the index of metadata for accessed addressable locations at the data repository.



32. The system of claim 31, wherein the accessed customizable settings further indicate at least one parameter to use with at least one validation checking program, further comprising:

means for using the at least one parameter when executing the validation checking program, wherein the validation output result further indicates characteristics of the content page related to the at least one parameter used with the validation checking program.

33. The system of claim 31, wherein the accessed customizable settings further indicate at least one qualifier to use with at least one validation checking program, further comprising:

means for determining whether the validation output result satisfies the at least one qualifier provided with the validation checking program producing the output result, wherein metadata for the output result is included in the index if the output result satisfies the qualifier.

34. The system of claim 33, wherein metadata for the content page at the addressable location is not included in the index if the validation output result does not satisfy the qualifier.



35. The system of claim 21, further comprising:

means for determining a format of the accessed content page;

means for selecting one of a plurality of parsers capable of parsing the determined format; and

means for parsing the content page using the selected parser, wherein the metadata to add to the index is generated from the parsed content page.



36. The system of claim 21, further comprising:

means for determining a parser capable of parsing an embedded file referenced in the content page;

means for parsing the content of the referenced embedded file; and

means for generating metadata for the parsed content of the embedded file to add to the index.



37. The system of claim 36, wherein the embedded file is encoded in a multimedia format.

38. The system of claim 22, further comprising:

means for distributing a collection tool to content providers capable of accessing and generating metadata for content provider data repositories using the accessed customizable settings; and

means for collecting metadata gathered from multiple content providers using the collection tool to gather metadata on their data repositories.



39. A program for searching a data repository managed by a content provider to gather indexable metadata on content at addressable locations at the data repository, wherein the program comprises code implemented in a computer readable medium capable of causing a computer to perform:

accessing settings capable of being customized by the content provider, wherein the customized settings provide instructions on how to search the content provider's data repository;

accessing content pages at the content provider's data repository;

accessing the content of content pages at the content provider's data repository in accordance with instructions included in the accessed customized settings; and

generating metadata from accessed content pages to add to an index of metadata for accessed addressable locations at the data repository.



40. The program of claim 39, wherein the customized settings include parameters and access methods unique to an arrangement of content in the content provider's data repository.



41. The program of claim 39, wherein the accessed customizable settings provide addressable locations at the content provider's data repository provided by the content provider, wherein accessing the content pages includes accessing the content pages at the provided addressable locations, wherein metadata is generated for the accessed content pages.

42. The program of claim 41, wherein the addressable locations comprise uniform resource locator (URL) addresses.



43. The program of claim 39, wherein the accessed customizable settings provide query terms for at least one included addressable location, wherein the program is further capable of causing the computer to perform:

for each provided addressable location for which there are query terms, using the provided query terms at the provided addressable location to obtain query results; and

generating metadata from the obtained query results to add to the index of metadata for accessed addressable locations at the data repository.



44. The program of claim 43, wherein the accessed customizable settings further provide qualifiers for at least one search term, wherein the program is further capable of causing the computer to perform:

for each query term having at least one qualifier, determining whether the query results for the query term satisfy each qualifier for the query term, wherein the metadata for the query result is generated if the query result satisfies each qualifier for the query term that generated the query result; and

performing a non-qualifying action for each query result that does not satisfy each qualifier.



45. The program of claim 44, wherein the non-qualifying action comprises not including metadata for the query result in the index.



46. The program of claim 39, wherein the accessed customizable settings further provide a password for at least one provided addressable location, wherein the program is further capable of causing the computer to perform:

using the provided password to access the content page at the indicated addressable location for which the password is provided.

47. The program of claim 39, wherein the accessed customizable settings further indicate a recursive search setting indicating whether to search hypertext links to linked addressable locations included in the accessed content page, wherein the program is further capable of causing the computer to perform:

accessing a content page at each linked addressable location included if the recursive search setting indicates to recursively search linked addressable locations, wherein metadata is generated for each content page recursively accessed at the linked addressable locations in the accessed content page.





48. The program of claim 47, wherein the accessed customizable settings further provide prohibited addressable locations at the data repository, wherein metadata is not generated for each content page at a linked addressable location that is one indicated prohibited address location.

49. The program of claim 39, wherein the accessed customizable settings further indicate validation checking programs, wherein the program is further capable of causing the computer to perform:

executing each validation checking program indicated in the accessed customizable settings against each accessed content page;

generating a validation output result with the validation checking program for each accessed content page with each validation checking program describing characteristics of the content page; and

generating metadata from the validation output result to add to the index of metadata for accessed addressable locations at the data repository.



50. The program of claim 49, wherein the accessed customizable settings further indicate at least one parameter to use with at least one validation checking program, wherein the program is further capable of causing the computer to perform:

using the at least one parameter when executing the validation checking program, wherein the validation output result further indicates characteristics of the content page related to the at least one parameter used with the validation checking program.



51. The program of claim 49, wherein the accessed customizable settings further indicate at least one qualifier to use with at least one validation checking program, wherein the program is further capable of causing the computer to perform:

determining whether the validation output result satisfies the at least one qualifier provided with the validation checking program producing the output result, wherein metadata for the output result is included in the index if the output result satisfies the qualifier.



52. The program of claim 51, wherein metadata for the content page at the addressable location is not included in the index if the validation output result does not satisfy the qualifier.



53. The program of claim 39, wherein the program is further capable of causing the computer to perform:

determining a format of the accessed content page;

selecting one of a plurality of parsers capable of parsing the determined format; and

parsing the content page using the selected parser, wherein the metadata to add to the index is generated from the parsed content page.



54. The program of claim 39, wherein the program is further capable of causing the computer to perform:

determining a parser capable of parsing an embedded file referenced in the content page;

parsing the content of the referenced embedded file; and

generating metadata for the parsed content of the embedded file to add to the index.



55. The program of claim 54, wherein the embedded file is encoded in a multimedia format.



56. The program of claim 39, further comprising:

distributing the program to content providers capable of accessing and generating metadata for content provider data repositories using the accessed customizable settings; and

collecting metadata gathered from multiple content providers using the collection tool to gather metadata on their data repositories.



57. The program of claim 56, further comprising:

receiving an electronic subscription from content providers to use the program to gather and provide metadata.



